Does Size Matter – How Much Data is Required to Train a REG Algorithm?
In this paper we investigate how much data is required to train an algorithm for attribute selection, a subtask of Referring Expression Generation (REG). To enable comparison between different-sized training sets, a systematic training method was developed. The results show that, depending on the complexity of the domain, training on 10 to 20 items may already lead to good performance.
NeuralREG: An end-to-end approach to referring expression generation
Traditionally, Referring Expression Generation (REG) models first decide on the form and then on the content of references to discourse entities in text, typically relying on features such as salience and grammatical function. In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit feature extraction. Using a delexicalized version of the WebNLG corpus, we show that the neural model substantially improves over two strong baselines. Data and models are publicly available. Comment: Accepted for presentation at ACL 2018.
Data-driven sentence simplification: Survey and benchmark
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand. To do so, several rewriting transformations can be performed, such as replacement, reordering, and splitting. Executing these transformations while keeping sentences grammatical, preserving their main idea, and generating simpler output is a challenging problem that is still far from solved. In this article, we survey research on SS, focusing on approaches that attempt to learn how to simplify using corpora of aligned original-simplified sentence pairs in English, which is the dominant paradigm nowadays. We also include a benchmark of different approaches on common datasets so as to compare them and highlight their strengths and limitations. We expect that this survey will serve as a starting point for researchers interested in the task and help spark new ideas for future developments.
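Two of the rewriting transformations named in the abstract, replacement and splitting, can be sketched as toy rules. The substitution table and the `", which "` split heuristic are invented for illustration; real data-driven systems learn such operations from aligned corpora rather than hand-coding them.

```python
def replace_words(sentence, substitutions):
    """Lexical replacement: swap complex words for simpler synonyms."""
    for complex_w, simple_w in substitutions.items():
        sentence = sentence.replace(complex_w, simple_w)
    return sentence


def split_sentence(sentence, connective=", which "):
    """Splitting: break a relative clause into a second sentence."""
    if connective in sentence:
        main, rest = sentence.split(connective, 1)
        return main + ". It " + rest
    return sentence


s = "The committee utilized a procedure, which had been approved earlier."
s = replace_words(s, {"utilized": "used"})
s = split_sentence(s)
# s: "The committee used a procedure. It had been approved earlier."
```

The hard part, as the survey notes, is applying such operations while keeping the output grammatical and meaning-preserving, which is why learned models dominate over rule tables like this one.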
Individual Variation in the Choice of Referential Form
This study aims to measure the variation between writers in their choices of referential form by collecting and analysing a new, publicly available corpus of referring expressions. The corpus is composed of referring expressions produced by different participants in identical situations. Results, measured in terms of normalized entropy, reveal substantial individual variation. We discuss the problems and prospects of this finding for automatic text generation applications.
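The normalized-entropy measure used in the abstract can be sketched as Shannon entropy over the distribution of forms chosen by writers, divided by the maximum entropy for the number of distinct forms, giving a value in [0, 1] (0 = all writers agree, 1 = choices are uniformly spread). The form labels below are toy data, not from the study's corpus.

```python
import math
from collections import Counter


def normalized_entropy(choices):
    """Shannon entropy of the observed form distribution,
    normalized by log2 of the number of distinct forms."""
    counts = Counter(choices)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    h = -sum(p * math.log2(p) for p in probs)
    max_h = math.log2(len(counts)) if len(counts) > 1 else 1.0
    return h / max_h


# Forms chosen by five writers for the same referent (toy data):
forms = ["pronoun", "proper name", "pronoun", "description", "pronoun"]
score = normalized_entropy(forms)  # substantial, but not maximal, variation
```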